Combining a self-organising map with memory-based learning

نویسندگان

  • James Hammerton
  • Erik F. Tjong Kim Sang
چکیده

Memory-based learning (MBL) has enjoyed considerable success in corpus-based natural language processing (NLP) tasks and is thus a reliable method of getting a high-level of performance when building corpus-based NLP systems. However there is a bottleneck in MBL whereby any novel testing item has to be compared against all the training items in memory base. For this reason there has been some interest in various forms of memory editing whereby some method of selecting a subset of the memory base is employed to reduce the number of comparisons. This paper investigates the use of a modified self-organising map (SOM) to select a subset of the memory items for comparison. This method involves reducing the number of comparisons to a value proportional to the square root of the number of training items. The method is tested on the identification of base noun-phrases in the Wall Street Journal corpus, using sections 15 to 18 for training and section 20 for testing.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Applications of the self-organising map to reinforcement learning

This article is concerned with the representation and generalisation of continuous action spaces in reinforcement learning (RL) problems. A model is proposed based on the self-organising map (SOM) of Kohonen [Self Organisation and Associative Memory, 1987] which allows either the one-to-one, many-to-one or one-to-many structure of the desired state-action mapping to be captured. Although presen...

متن کامل

Multidimensional data classification with artificial neural networks

Multi-dimensional data classification is an important and challenging problem in many astro-particle experiments. Neural networks have proved to be versatile and robust in multi-dimensional data classification. In this article we shall study the classification of gamma from the hadrons for the MAGIC Experiment. Two neural networks have been used for the classification task. One is Multi-Layer P...

متن کامل

A Comparison of the Lbg, Lvq, Mlp, Som and Gmm Algorithms for Vector Quantisation and Clustering Analysis

We compare the performance of ve algorithms for vector quan-tisation and clustering analysis: the Self-Organising Map (SOM) and Learning Vector Quantization (LVQ) algorithms of Kohonen, the Linde-Buzo-Gray (LBG) algorithm, the MultiLayer Perceptron (MLP) and the GMM/EM algorithm for Gaussian Mixture Models (GMM). We propose that the GMM/EM provides a better representation of the speech space an...

متن کامل

Self-Organising Maps in Document Classification: A Comparison with Six Machine Learning Methods

This paper focuses on the use of self-organising maps, also known as Kohonen maps, for the classification task of text documents. The aim is to effectively and automatically classify documents to separate classes based on their topics. The classification with self-organising map was tested with three data sets and the results were then compared to those of six well known baseline methods: k-mea...

متن کامل

Maximum Likelihood Topology Preserving Ensembles

Statistical re-sampling techniques have been used extensively and successfully in the machine learning approaches for generations of classifier and predictor ensembles. It has been frequently shown that combining so called unstable predictors has a stabilizing effect on and improves the performance of the prediction system generated in this way. In this paper we use the resampling techniques in...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره cs.CL/0107018  شماره 

صفحات  -

تاریخ انتشار 2001